RAW SEQUENCE LISTING 



ERROR REPORT 



The Biotechnology Systems Branch of the Scientific and Technical Information 
Center (STIC) detected errors when processing the following computer readable 
form: 

Application Serial Number: 

Source:

Date Processed by STIC:

THE ATTACHED PRINTOUT EXPLAINS DETECTED ERRORS. 

PLEASE FORWARD THIS INFORMATION TO THE APPLICANT BY EITHER: 

1) INCLUDING A COPY OF THIS PRINTOUT IN YOUR NEXT COMMUNICATION TO THE 
APPLICANT, WITH A NOTICE TO COMPLY or, 

2) TELEPHONING APPLICANT AND FAXING A COPY OF THIS PRINTOUT, WITH A 
NOTICE TO COMPLY 

FOR CRF SUBMISSION AND PATENTIN SOFTWARE QUESTIONS, PLEASE CONTACT 
MARK SPENCER, TELEPHONE: 703-308-4212; FAX: 703-308-4221 
Effective 12/13/03: TELEPHONE: 571-272-2510; FAX: 571-273-0221



TO REDUCE ERRORED SEQUENCE LISTINGS, PLEASE USE THE CHECKER
VERSION 4.1 PROGRAM, ACCESSIBLE THROUGH THE U.S. PATENT AND
TRADEMARK OFFICE WEBSITE. SEE BELOW FOR ADDRESS:

http://www.uspto.gov/web/offices/pac/checker/chkr41note.htm

Applicants submitting genetic sequence information electronically on diskette or CD-Rom should be aware that there is
a possibility that the disk/CD-Rom may have been affected by treatment given to all incoming mail.

Please consider using alternate methods of submission for the disk/CD-Rom or replacement disk/CD-Rom.

Any reply including a sequence listing in electronic form should NOT be sent to the 20231 zip code address for the
United States Patent and Trademark Office, and instead should be sent via the following to the indicated addresses:

1. EFS-Bio (<http://www.uspto.gov/ebc/efs/downloads/documents.htm>, EFS Submission User Manual - ePAVE)

2. U.S. Postal Service: Commissioner for Patents, P.O. Box 1450, Alexandria, VA 22313-1450

3. Hand Carry directly to (EFFECTIVE 12/01/03):
   U.S. Patent and Trademark Office, Box Sequence, Customer Window, Lobby, Room 1B03, Crystal Plaza Two,
   2011 South Clark Place, Arlington, VA 22202

4. Federal Express, United Parcel Service, or other delivery service to: U.S. Patent and Trademark Office,
   Box Sequence, Room 1B03-Mailroom, Crystal Plaza Two, 2011 South Clark Place, Arlington, VA 22202



Revised 10/08/03 



Raw Sequence Listing Error Summary

ERROR DETECTED / SUGGESTED CORRECTION          SERIAL NUMBER:

ATTN: NEW RULES CASES: PLEASE DISREGARD ENGLISH "ALPHA" HEADERS WHICH WERE INSERTED BY PTO SOFTWARE

1  Wrapped Nucleics / Wrapped Aminos
   The number/text at the end of each line "wrapped" down to the next line. This may occur if your file
   was retrieved in a word processor after creating it. Please adjust your right margin to .3; this will
   prevent "wrapping."

2  Invalid Line Length
   The rules require that a line not exceed 72 characters in length. This includes white spaces.

3  Misaligned Amino Numbering
   The numbering under each 5th amino acid is misaligned. Do not use tab codes between numbers;
   use space characters, instead.

4  Non-ASCII
   The submitted file was not saved in ASCII (DOS) text, as required by the Sequence Rules. Please
   ensure your subsequent submission is saved in ASCII text.

5  Variable Length
   Sequence(s) ____ contain n's or Xaa's representing more than one residue. Per Sequence Rules,
   each n or Xaa can only represent a single residue. Please present the maximum number of each
   residue having variable length and indicate in the <220>-<223> section that some may be missing.

6  PatentIn 2.0 "bug"
   A "bug" in PatentIn version 2.0 has caused the <220>-<223> section to be missing from amino acid
   sequence(s) ____. Normally, PatentIn would automatically generate this section from the
   previously coded nucleic acid sequence. Please manually copy the relevant <220>-<223> section to
   the subsequent amino acid sequence. This applies to the mandatory <220>-<223> sections for
   Artificial or Unknown sequences.

7  Skipped Sequences (OLD RULES)
   Sequence(s) ____ missing. If intentional, please insert the following lines for each skipped sequence:

   (2) INFORMATION FOR SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
   (i) SEQUENCE CHARACTERISTICS: (Do not insert any subheadings under this heading)
   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
   This sequence is intentionally skipped

   Please also adjust the "(ii) NUMBER OF SEQUENCES:" response to include the skipped sequences.

8  Skipped Sequences (NEW RULES)
   Sequence(s) ____ missing. If intentional, please insert the following lines for each skipped sequence:

   <210> sequence id number
   <400> sequence id number
   000

9  Use of n's or Xaa's (NEW RULES)
   Use of n's and/or Xaa's have been detected in the Sequence Listing.
   Per 1.823 of Sequence Rules, use of <220>-<223> is MANDATORY if n's or Xaa's are present.
   In <220> to <223> section, please explain location of n or Xaa, and which residue n or Xaa represents.

10 Invalid <213> Response
   Per 1.823 of Sequence Rules, the only valid <213> responses are: Unknown, Artificial Sequence, or
   scientific name (Genus/species). <220>-<223> section is required when <213> response is Unknown or
   is Artificial Sequence.

11 Use of <220>
   Sequence(s) ____ missing the <220> "Feature" and associated numeric identifiers and responses.
   Use of <220> to <223> is MANDATORY if <213> "Organism" response is "Artificial Sequence" or
   "Unknown." Please explain source of genetic material in <220> to <223> section.
   (See "Federal Register," 06/01/1998, Vol. 63, No. 104, pp. 29631-32) (Sec. 1.823 of Sequence Rules)

12 PatentIn 2.0 "bug"
   Please do not use "Copy to Disk" function of PatentIn version 2.0. This causes a corrupted file,
   resulting in missing mandatory numeric identifiers and responses (as indicated on raw sequence
   listing). Instead, please use "File Manager" or any other manual means to copy file to floppy disk.

13 Misuse of n/Xaa
   "n" can only represent a single nucleotide; "Xaa" can only represent a single amino acid.



AMC - Biotechnology Systems Branch - 09/09/2003 
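
Several of the checks in the summary above (the 72-character line limit, tab codes in place of spaces, and text not saved as ASCII) can be screened for before a listing is resubmitted. The following is a minimal sketch only, not the Checker program itself: the function name `check_listing_lines`, the constant `MAX_LINE_LENGTH`, and the sample text are hypothetical, and the sketch assumes the listing has already been read into a Python string.

```python
MAX_LINE_LENGTH = 72  # limit quoted in the "Invalid Line Length" entry


def check_listing_lines(text):
    """Return (line_number, problem) tuples for simple formatting problems.

    Hypothetical helper for illustration; it covers only three of the
    conditions described in the error summary.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # White space counts toward the limit, so measure the raw line.
        if len(line) > MAX_LINE_LENGTH:
            problems.append((number, "longer than 72 characters"))
        # Tab codes between numbers cause misaligned amino acid numbering.
        if "\t" in line:
            problems.append((number, "contains a tab code"))
        # Any character outside the ASCII range suggests a non-ASCII save.
        if not line.isascii():
            problems.append((number, "contains non-ASCII characters"))
    return problems


# Made-up sample: an 80-character line, a tab, a non-ASCII character, a clean line.
sample = "ACGT" * 20 + "\n" + "Met\tAla\n" + "caf\u00e9\n" + "ACGT ACGT\n"
for number, problem in check_listing_lines(sample):
    print(f"line {number}: {problem}")
```

A clean result from a screen like this does not show that a listing complies with the Sequence Rules; it only rules out the three conditions tested.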



RAW SEQUENCE LISTING DATE: 01/30/2004 

PATENT APPLICATION: US/10/761,006 TIME: 14:36:38

Input Set : A:\seqlist.txt 
Output Set: N:\CRF4\01302004\J761006.raw 

SEQUENCE LISTING

Does Not Comply

      4 (1) GENERAL INFORMATION:
      6   (i) APPLICANT: Oon, Chong Jin
      7        Lim, Gek Keow
      8        Zhao, Yi
      9        Chen, Wei Ning
     11   (ii) TITLE OF INVENTION: A MUTANT HUMAN HEPATITIS B VIRAL STRAIN AND
     12        USES THEREOF
     14   (iii) NUMBER OF SEQUENCES: 11
     16   (iv) CORRESPONDENCE ADDRESS:
     17        (A) ADDRESSEE: Ladas & Parry
     18        (B) STREET: 26 West 61 Street
     19        (C) CITY: New York
     20        (D) STATE: New York
     21        (E) COUNTRY: USA
     22        (F) ZIP: 10023
     24   (v) COMPUTER READABLE FORM:
     25        (A) MEDIUM TYPE: Floppy disk
     26        (B) COMPUTER: IBM PC compatible
     27        (C) OPERATING SYSTEM: PC-DOS/MS-DOS
     28        (D) SOFTWARE: PatentIn Release #1.0, Version #1.30
     30   (vi) CURRENT APPLICATION DATA:
C--> 31        (A) APPLICATION NUMBER: US/10/761,006
C--> 32        (B) FILING DATE: 20-Jan-2004
     33        (C) CLASSIFICATION: 435
     35   (vii) PRIOR APPLICATION DATA:
     36        (A) APPLICATION NUMBER: PCT/SG98/00046
     37        (B) FILING DATE: 19-JAN-1998
     39   (viii) ATTORNEY/AGENT INFORMATION:
     40        (A) NAME: Mass, Clifford J.
     41        (B) REGISTRATION NUMBER: 30,086
     42        (C) REFERENCE/DOCKET NUMBER: U-014987-0
     44   (ix) TELECOMMUNICATION INFORMATION:
     45        (A) TELEPHONE: (212) 708-1800



ERRORED SEQUENCES 

48 (2) INFORMATION FOR SEQ ID NO: 1:

50 (i) SEQUENCE CHARACTERISTICS: 

51 (A) LENGTH: 3215 base pairs 

52 (B) TYPE: nucleic acid 

53 (C) STRANDEDNESS: double 

54 (D) TOPOLOGY: circular 

58 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

E--> 60 CTCCACAACA TTCCACCAAG CTCTGCTAGA TCCCAGGGTG AGGGGCCTAT
     61 ATTTTCCTGC 60
E--> 63 TGGTGGCTCC AGTTCCGGAA CAGTAAACCC TGTTCCGACT ACTGCCTCTC
     64 CCATATCGTC 120
E--> 66 AATCTTCTCG AGGACTGGGG ACCCTGCACC GAACATGGAG AACACAACAT
     67 CAGGATTCCT 180
E--> 69 AGGACCCCTG CTCGTGTTAC AGGCGGGGTT TTTCTCGTTG ACAAGAATCC
     70 TCACAATACC 240
E--> 72 GCAGAGTCTA GACTCTGGTG GACTTCTCTC AATTTTCTAG GGGGAGCACC
     73 CACGTGTTCC 300
E--> 75 TGGCCAAAAT TCGCAGTCCC CAACCTCCAA TCACTCACCA ACCTCTTGTC
     76 CTCCAATTTG 360
E--> 78 TCCTGGCTAT CGCTGGATGT GTCTGCGGCG TTTTATCATA TTCCTCTTCA
     79 TCCTGCTGCT 420
E--> 81 ATGCCTCATC TTCTTGTTGG TTCTTCTGGA CTACCAAGGT ATGTTGCCCG
     82 TTTGTCCTCT 480
E--> 84 ACTTCCAGGA ACATCAACCA CCAGCACGGG GCCATGCAAG ACCTGCACGA
     85 CTCCTGCTCA 540
E--> 87 AGGAAACTCT ACGTTTCCCT CTTGTTGCTG TACAAAACCT TCGGACGGAA
     88 ACTGCACTTG 600
E--> 90 TATTCCCATC CCATCATCCT GGGCTTTCGC AAGATTCCTA TGGGAGTGGG
     91 CCTCAGTCCG 660
E--> 93 TTTCTCCTGG CTCAGTTTAC TAGTGCCATT TGTTCAGTGG TTCGTAGGGC
     94 TTTCCCCCAC 720
E--> 96 TGTTTGGCTT TCAGTTATAT GGATGATGTG GTATTGGGGG CGAAGTCTGT
     97 ACAACATCTT 780
E--> 99 GAGTCCCTTT TTACCTCTAT TACCAATTTT CTTTTGTCTT TGGGTATACA
     100 TTTAAACCCT 840
E--> 102 AATAAAACCA AACGTTGGGG CTACTCCCTT AACTTCATGG GATATGTAAT
     103 TGGAAGTTGG 900
E--> 105 GGTACTTTAC CGCAGGAACA TATTGTACTA AAACTCAAGC AATGTTTTCG
     106 AAAACTGCCT 960
E--> 108 GTAAATAGAC CTATTGATTG GAAAGTATGT CAAAGAATTG TGGGTCTTTT
     109 GGGCTTTGCT 1020
E--> 111 GCCCCTTTTA CACAATGTGG CTATCCTGCC TTGATGCCTT TATATGCATG
     112 TATACAATCT 1080
E--> 114 AAGCAGGCTT TCACTTTCTC GCCAACTTAC AAGGCCTTTC TGTGTAAACA
     115 ATATCTGAAC 1140
E--> 117 CTTTACCCCG TTGCCCGGCA ACGGTCCGGT CTCTGCCAAG TGTTTGCTGA
     118 CGCAACCCCC 1200
E--> 120 ACTGGATGGG GCTTGGCCAT AGGCCATCAG CGCATGGCTG GAACCTTTCT
     121 GGCTCCTCTG 1260
E--> 123 CCGATCCATA CTGCGGAACT CCTAGCAGCT TGTTTTGCTC GCAGCCGGTC
     124 TGGAGCAAAA 1320
E--> 126 CTTATCGGAA CCGACAACTC TGTTGTCCTC TCTCGGAAAT ACACCTCCTT
     127 TCCATGGCTG 1380
E--> 129 CTAGGGTGTG CTGCCAACTG GATCCTGCGC GGGACGTCCT TTGTCTACGT
     130 CCCGTCGGCG 1440
E--> 132 CTGAATCCCG CGGACGACCC GTCTCGGGGC CGTTTGGGGC TCTACCGTCC
     133 CCTTCTTCAT 1500
E--> 135 CTGCCGTTCC GGCCGACCAC GGGGCGCACC TCTCTTTACG CGGTCTCCCC
     136 GTATGTGCCT 1560
E--> 138 TCTCATCTGC CGGACCGTGT GCACTTCGCT TCACCTCTGC ACGTCGCATG
     139 GAGACCACCG 1620
E--> 141 TGAACGCACG CCAGGTCTTG CCCAAGGTCT TATATAAGAG GACTCTTGGA
     142 CTCTCAGCAA 1680
E--> 144 TGTCAACGAC CGACCTTGAG GCATACTTCA AAGACTGTGT GTTTAAAGAC
     145 TGGGAGGAGT 1740
E--> 147 TGGGGGAGGA GATTAGGTTA AAGATTTATG TACTAGGAGG CTGTAGGCAT
     148 AAATTGGTCT 1800
E--> 150 GTTCACCAGC ACCATGCAAC TTTTTCTCCT CTGCCTAATC ATCTCATGTT
     151 CATGTCCTAC 1860
E--> 153 TGTTCAAGCC TCCAAGCTGT GCCTTGGGTG GCTTTGGGAC ATGGACATTG
     154 ACCCGTATAA 1920
E--> 156 AGAATTTGGA GCATCTGCTG AGTTACTCTC TTTTTTGCCT TCTGACTTCT
     157 TTCCGTCTAT 1980
E--> 159 TCGAGATCTC CTCGACACCG CCTCTGCTCT GTATCGGGAG GCCTTAGAGT
     160 CTCCGGAACA 2040
E--> 162 TTGTTCGCCT CACCATACAG CACTCAGGCA AGCTATTTTG TGTTGGGGTG
     163 AGTTGATGAA 2100
E--> 165 TCTGGCCACC TGGGTGGGAA GTAATTTGGA AGATCCAGCA TCCAGGGAAT
     166 TAGTAGTCAG 2160
E--> 168 CTATGTCAAC GTTAATATGG GCCTAAAACT CAGACAAATA TTGTGGTTTC
     169 ACATTTCCTG 2220
E--> 171 TCTTACTTTT GGAAGAGAAA CTGTTCTTGA GTACTTGGTA TCTTTTGGAG
     172 TGTGGATTCG 2280
E--> 174 CACTCCTACC GCTTACAGAC CACCAAATGC CCCTATCTTA TCAACACTTC
     175 CGGAAACTAC 2340
E--> 177 TGTTGTTAGA CGACGAGGCA GGTCCCCTAG AAGAAGAACT CCCTCGCCTC
     178 GCAGACGAAG 2400
E--> 180 GTCTCAATCG CCGCGTCGCA GAAGATCTCA ATCTCGGGAA TCTCAACGTT
     181 AGTATTCCTT 2460
E--> 183 GGACTCATAA GGTGGGAAAC TTTACTGGGC TTTATTCTTC TACTGTACCT
     184 GTCTTTAATC 2520
E--> 186 CCGAGTGGCA AATTCCTTCC TTTCCTCACA TTCATTTACA AGAGGACATT
     187 ATTAATAGAT 2580
E--> 189 GTCAACAATA TGTGGGCCCT CTTACAGTTA ATGAAAAAAG AAGATTAAAA
     190 TTAATTATGC 2640
E--> 192 CTGCTAGGTT TTATCCTAAC CTTACTAAAT ATTTGCCCTT AGACAAAGGC
     193 ATTAAACCGT 2700
E--> 195 ATTATCCTGA ACATGCAGTT AATCATTACT TCAAAACTAG GCATTATTTA
     196 CATACTCTGT 2760
E--> 198 GGAAGGCTGG CATTCTATAT AAGAGAGAAA CTACACGCAG CGCCTCATTT
     199 TGTGGGTCAC 2820
E--> 201 CATATTCTTG GGAACAAGAG CTACAGCATG GGAGGTTGGT CTTCCAAACC
     202 TCGACAAGGC 2880
E--> 204 ATGGGGAGCA ATCTTGCTGT TCCCAATCCT CTGGGATTCT TTCCCGATCA
     205 CCAGTTGGAC 2940
E--> 207 CCTGCGTTCG GAGCCAACTC AAACAATCCA GATTGGGACT TCAACCCCAA
     208 CAAGGATCAC 3000
E--> 210 TGGCCAGAGG CAAATCAGGT AGGAGTGGGA GCATTCGGGC CAGGGTTCAC
     211 CCCACCACAC 3060
E--> 213 GGCGGTCTTT TGGGGGGGAG CCCTCAGGCT CAGGGCATAT TGACAACAGT
     214 GCCAGCAGCA 3120
E--> 216 CCTCCTCCTG CCTCCACCAA TCGGCAGTCA GGAAGACAGC CTACTCCCAT
     217 CTCTCCACCT 3180
     219 CTAAGAGACA GTCATCCTCA GGCCACGCAG TGGAA 3215
VERIFICATION SUMMARY DATE: 01/30/2004

PATENT APPLICATION: US/10/761,006 TIME: 14:36:39

Input Set : A:\seqlist.txt 

Output Set: N:\CRF4\01302004\J761006.raw 

L:31 M:220 C: Keyword misspelled or invalid format, [(A) APPLICATION NUMBER: ]
L:32 M:220 C: Keyword misspelled or invalid format, [(B) FILING DATE: ]
L:60 M:254 E: No. of Bases conflict, Input: 0 Counted: 50 SEQ:1
M:254 Repeated in SeqNo=1



The type of errors shown exist throughout 
the Sequence Listing. Please check subsequent
sequences for similar errors. 
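
The "Counted: 50" reported for line 60 is consistent with the five ten-base groups on that line, with the sixth group and the running total (60) wrapped onto line 61. The sketch below only illustrates that arithmetic; it is not the Checker program's actual counting method, which this report does not describe, and the helper name `count_bases` is hypothetical.

```python
def count_bases(sequence_line):
    """Count residue letters on one line, ignoring spaces and digits.

    Hypothetical helper for illustration only.
    """
    return sum(1 for ch in sequence_line if ch.isalpha())


# Lines 60 and 61 of the raw listing, without their printout line numbers.
line_60 = "CTCCACAACA TTCCACCAAG CTCTGCTAGA TCCCAGGGTG AGGGGCCTAT"
line_61 = "ATTTTCCTGC 60"

print(count_bases(line_60))                         # bases on line 60 alone
print(count_bases(line_60) + count_bases(line_61))  # bases once the wrap is undone
```

Under these assumptions the first value is 50 and the second is 60, which matches the running total printed at the end of the wrapped line.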